Overview

Dataset statistics

Number of variables: 24
Number of observations: 15162
Missing cells: 622
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 MiB
Average record size in memory: 192.0 B
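
The percentage and size figures above can be cross-checked from the raw counts. This is a minimal sketch, not the profiler's own code: it assumes the missing percentage is missing cells divided by all cells, and that each cell costs 8 bytes in memory (an assumption that happens to reproduce the reported 192.0 B for 24 variables).

```python
# Cross-check the dataset statistics from the reported counts.
n_variables = 24
n_observations = 15162
missing_cells = 622

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

BYTES_PER_CELL = 8  # assumption: one 8-byte slot per cell
record_bytes = n_variables * BYTES_PER_CELL
total_mib = record_bytes * n_observations / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {record_bytes:.1f} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```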

Variable types

Numeric: 8
Text: 12
Categorical: 3
DateTime: 1

Alerts

X is highly overall correlated with Ward [High correlation]
FID is highly overall correlated with ID [High correlation]
ID is highly overall correlated with FID [High correlation]
NAICSCode is highly overall correlated with NAICSTitle [High correlation]
Ward is highly overall correlated with X [High correlation]
NAICSTitle is highly overall correlated with NAICSCode [High correlation]
BIA is highly imbalanced (73.6%) [Imbalance]
EmplRange has 562 (3.7%) missing values [Missing]
FID is uniformly distributed [Uniform]
FID has unique values [Unique]
ID has unique values [Unique]
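
The Missing and Unique alerts can be checked against the row count. The sketch below only recomputes the percentages quoted in the alerts; the helper name `pct` is hypothetical, and the thresholds that trigger each alert are not reproduced here.

```python
# Recompute the percentages quoted in the Missing and Unique alerts.
n_rows = 15162

def pct(count, total=n_rows):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)

emplrange_missing_pct = pct(562)    # EmplRange missing values
fid_distinct_pct = pct(15162)       # FID distinct values
fid_is_unique = (15162 == n_rows)   # unique means one distinct value per row

print(emplrange_missing_pct, fid_distinct_pct, fid_is_unique)
```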

Reproduction

Analysis started: 2023-06-29 16:51:03.629024
Analysis finished: 2023-06-29 16:51:47.379333
Duration: 43.75 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
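
The reported duration is the difference between the two timestamps above. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started/finished" fields.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-06-29 16:51:03.629024", FMT)
finished = datetime.strptime("2023-06-29 16:51:47.379333", FMT)

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```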

Variables

X
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4448
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -79.653438
Minimum: -79.80288
Maximum: -79.547367
Zeros: 0
Zeros (%): 0.0%
Negative: 15162
Negative (%): 100.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: -79.80288
5-th percentile: -79.74349
Q1: -79.679364
Median: -79.649892
Q3: -79.621379
95-th percentile: -79.577756
Maximum: -79.547367
Range: 0.25551265
Interquartile range (IQR): 0.057984364
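
Range and IQR are simple differences of the quantiles listed above. The sketch below recomputes both from the rounded values shown in the report, so the results agree with the reported figures only to about six decimal places.

```python
# Quantile statistics for X, copied from the report (rounded values).
minimum, q1, q3, maximum = -79.80288, -79.679364, -79.621379, -79.547367

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(f"Range: {value_range:.6f}")
print(f"IQR:   {iqr:.6f}")
```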

Descriptive statistics

Standard deviation: 0.047663124
Coefficient of variation (CV): -0.00059838125
Kurtosis: 0.0049222775
Mean: -79.653438
Median Absolute Deviation (MAD): 0.028897726
Skewness: -0.4479999
Sum: -1207705.4
Variance: 0.0022717734
Monotonicity: Not monotonic
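
The negative coefficient of variation for X is not an error: CV is the standard deviation divided by the mean, and the mean is negative here. A short check from the reported values, which also confirms that the variance is the square of the standard deviation:

```python
# Descriptive statistics for X, copied from the report.
std = 0.047663124
mean = -79.653438

cv = std / mean        # negative because the mean is negative
variance = std ** 2    # should match the reported variance

print(f"CV:       {cv:.11f}")
print(f"Variance: {variance:.10f}")
```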
Histogram with fixed size bins (bins=50)
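
Fixed-size binning splits the interval from the minimum to the maximum into equal-width bins and counts the values in each. The function below is an illustrative sketch of that idea with a hypothetical name, not the plotting code the report uses; with 50 bins over the reported range for X, each bin would be roughly 0.0051 wide.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum falls on the right edge; clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width

counts, width = fixed_width_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5)
print(counts, width)
```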
Value                  Count   Frequency (%)
-79.64265788           204     1.3%
-79.66361487           118     0.8%
-79.63790272           114     0.8%
-79.71163795           105     0.7%
-79.56915738           82      0.5%
-79.61380182           65      0.4%
-79.7597192            58      0.4%
-79.65270754           54      0.4%
-79.67778283           50      0.3%
-79.63609212           47      0.3%
Other values (4438)    14265   94.1%
Value                  Count   Frequency (%)
-79.80287951           1       < 0.1%
-79.80135163           1       < 0.1%
-79.79835336           1       < 0.1%
-79.79506726           1       < 0.1%
-79.79421616           1       < 0.1%
-79.78920075           21      0.1%
-79.78907833           2       < 0.1%
-79.7884516            2       < 0.1%
-79.78684169           21      0.1%
-79.78668142           10      0.1%
Value                  Count   Frequency (%)
-79.54736685           1       < 0.1%
-79.54998731           2       < 0.1%
-79.55183758           1       < 0.1%
-79.55277418           1       < 0.1%
-79.55286258           1       < 0.1%
-79.55374903           1       < 0.1%
-79.55447862           1       < 0.1%
-79.55476496           2       < 0.1%
-79.55507799           1       < 0.1%
-79.55508634           1       < 0.1%

Y
Real number (ℝ)

Distinct: 4448
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.613516
Minimum: 43.484595
Maximum: 43.732856
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 43.484595
5-th percentile: 43.523873
Q1: 43.580514
Median: 43.611545
Q3: 43.649946
95-th percentile: 43.699396
Maximum: 43.732856
Range: 0.24826087
Interquartile range (IQR): 0.069431469

Descriptive statistics

Standard deviation: 0.050530368
Coefficient of variation (CV): 0.0011585942
Kurtosis: -0.53989159
Mean: 43.613516
Median Absolute Deviation (MAD): 0.03375338
Skewness: -0.031832227
Sum: 661268.14
Variance: 0.0025533181
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
43.59274869 204
 
1.3%
43.68932775 118
 
0.8%
43.71980161 114
 
0.8%
43.55812565 105
 
0.7%
43.59312365 82
 
0.5%
43.63261974 65
 
0.4%
43.58235601 58
 
0.4%
43.70950797 54
 
0.4%
43.67341442 50
 
0.3%
43.5918369 47
 
0.3%
Other values (4438) 14265
94.1%
ValueCountFrequency (%)
43.48459499 1
< 0.1%
43.48510004 1
< 0.1%
43.4896017 1
< 0.1%
43.49130084 1
< 0.1%
43.49181735 1
< 0.1%
43.49286462 1
< 0.1%
43.49445361 1
< 0.1%
43.4957017 1
< 0.1%
43.49604197 1
< 0.1%
43.4961905 1
< 0.1%
ValueCountFrequency (%)
43.73285585 5
< 0.1%
43.73237782 1
 
< 0.1%
43.73193982 2
 
< 0.1%
43.73066185 1
 
< 0.1%
43.72939899 1
 
< 0.1%
43.72768462 2
 
< 0.1%
43.72553117 2
 
< 0.1%
43.72539371 1
 
< 0.1%
43.72538172 2
 
< 0.1%
43.7247614 2
 
< 0.1%

FID
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 15162
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7581.5
Minimum: 1
Maximum: 15162
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 759.05
Q1: 3791.25
Median: 7581.5
Q3: 11371.75
95-th percentile: 14403.95
Maximum: 15162
Range: 15161
Interquartile range (IQR): 7580.5

Descriptive statistics

Standard deviation: 4377.0367
Coefficient of variation (CV): 0.57733123
Kurtosis: -1.2
Mean: 7581.5
Median Absolute Deviation (MAD): 3790.5
Skewness: 0
Sum: 1.149507 × 10⁸
Variance: 19158450
Monotonicity: Strictly increasing
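
These FID statistics are exactly what a plain row counter 1, 2, ..., 15162 produces, which is why the column is flagged as uniform, unique and strictly increasing. The sketch below compares the standard library's results with the closed forms for the integers 1..n: mean (n + 1) / 2, sum n(n + 1) / 2, and sample variance n(n + 1) / 12. The skewness of 0 and kurtosis of -1.2 are likewise the values expected for a uniform distribution.

```python
import math
import statistics

n = 15162
values = list(range(1, n + 1))  # a simple row counter, as FID appears to be

mean = statistics.mean(values)
total = sum(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)

# Closed forms for the integers 1..n.
assert mean == (n + 1) / 2
assert total == n * (n + 1) // 2
assert variance == n * (n + 1) / 12

# Strictly increasing: every value is larger than the one before it.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, total, variance, round(std, 4), strictly_increasing)
```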
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
< 0.1%
10113 1
 
< 0.1%
10101 1
 
< 0.1%
10102 1
 
< 0.1%
10103 1
 
< 0.1%
10104 1
 
< 0.1%
10105 1
 
< 0.1%
10106 1
 
< 0.1%
10107 1
 
< 0.1%
10108 1
 
< 0.1%
Other values (15152) 15152
99.9%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
10 1
< 0.1%
ValueCountFrequency (%)
15162 1
< 0.1%
15161 1
< 0.1%
15160 1
< 0.1%
15159 1
< 0.1%
15158 1
< 0.1%
15157 1
< 0.1%
15156 1
< 0.1%
15155 1
< 0.1%
15154 1
< 0.1%
15153 1
< 0.1%

ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 15162
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42908.216
Minimum: 7
Maximum: 97467
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 7
5-th percentile: 2672.1
Q1: 11386.75
Median: 47113
Q3: 84095.25
95-th percentile: 94470.85
Maximum: 97467
Range: 97460
Interquartile range (IQR): 72708.5

Descriptive statistics

Standard deviation: 33675.723
Coefficient of variation (CV): 0.78483159
Kurtosis: -1.4570963
Mean: 42908.216
Median Absolute Deviation (MAD): 36269.5
Skewness: 0.3075195
Sum: 6.5057437 × 10⁸
Variance: 1.1340543 × 10⁹
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2791 1
 
< 0.1%
83852 1
 
< 0.1%
83803 1
 
< 0.1%
83808 1
 
< 0.1%
83822 1
 
< 0.1%
83824 1
 
< 0.1%
83826 1
 
< 0.1%
83833 1
 
< 0.1%
83834 1
 
< 0.1%
83835 1
 
< 0.1%
Other values (15152) 15152
99.9%
ValueCountFrequency (%)
7 1
< 0.1%
10 1
< 0.1%
18 1
< 0.1%
20 1
< 0.1%
21 1
< 0.1%
26 1
< 0.1%
27 1
< 0.1%
34 1
< 0.1%
37 1
< 0.1%
41 1
< 0.1%
ValueCountFrequency (%)
97467 1
< 0.1%
97465 1
< 0.1%
97464 1
< 0.1%
97462 1
< 0.1%
97460 1
< 0.1%
97456 1
< 0.1%
97450 1
< 0.1%
97442 1
< 0.1%
97431 1
< 0.1%
97429 1
< 0.1%

Name
Text

Distinct: 13831
Distinct (%): 91.2%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 111
Median length: 70
Mean length: 22.1883
Min length: 2

Characters and Unicode

Total characters: 336419
Distinct characters: 93
Distinct categories: 14
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
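
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Po` for Other Punctuation). The sketch below tallies categories for one of the sample names from this report; the function name is hypothetical, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "Prestige Pools & Leisure Products Ltd."
counts = category_counts(sample)

for code, count in counts.most_common():
    print(code, count)
```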

Unique

Unique: 13270
Unique (%): 87.5%

Sample

1st row: Thai Rice
2nd row: Vintners Cellar Britannia Inc.
3rd row: UK Insurance Brokers Inc
4th row: Prestige Pools & Leisure Products Ltd.
5th row: Grocery Cafe
ValueCountFrequency (%)
inc 2832
 
5.4%
1727
 
3.3%
ltd 1450
 
2.8%
canada 873
 
1.7%
centre 636
 
1.2%
services 487
 
0.9%
and 463
 
0.9%
the 439
 
0.8%
of 388
 
0.7%
corp 374
 
0.7%
Other values (12161) 42868
81.6%
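
The word table above lists lowercased tokens without trailing punctuation ("inc", "ltd"). The exact tokenizer the profiler uses is not shown in this report, so the following is only a sketch under that assumption: split on whitespace, lowercase, strip surrounding punctuation, and drop tokens that become empty. It is applied to the five sample rows of Name.

```python
import string
from collections import Counter

def word_counts(names):
    """Count lowercased whitespace-separated tokens, stripped of punctuation."""
    counts = Counter()
    for name in names:
        for word in name.lower().split():
            token = word.strip(string.punctuation)
            if token:  # skip tokens that were pure punctuation, such as "&"
                counts[token] += 1
    return counts

sample_names = [
    "Thai Rice",
    "Vintners Cellar Britannia Inc.",
    "UK Insurance Brokers Inc",
    "Prestige Pools & Leisure Products Ltd.",
    "Grocery Cafe",
]
counts = word_counts(sample_names)
print(counts.most_common(3))
```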

Most occurring characters

ValueCountFrequency (%)
37420
 
11.1%
e 25649
 
7.6%
a 24600
 
7.3%
n 21548
 
6.4%
i 19612
 
5.8%
r 19403
 
5.8%
o 18483
 
5.5%
t 17825
 
5.3%
s 14884
 
4.4%
l 12127
 
3.6%
Other values (83) 124868
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 235503
70.0%
Uppercase Letter 53392
 
15.9%
Space Separator 37420
 
11.1%
Other Punctuation 7971
 
2.4%
Decimal Number 777
 
0.2%
Dash Punctuation 758
 
0.2%
Open Punctuation 223
 
0.1%
Close Punctuation 223
 
0.1%
Final Punctuation 109
 
< 0.1%
Math Symbol 34
 
< 0.1%
Other values (4) 9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 25649
10.9%
a 24600
10.4%
n 21548
9.1%
i 19612
 
8.3%
r 19403
 
8.2%
o 18483
 
7.8%
t 17825
 
7.6%
s 14884
 
6.3%
l 12127
 
5.1%
c 11192
 
4.8%
Other values (22) 50180
21.3%
Uppercase Letter
ValueCountFrequency (%)
C 7083
13.3%
S 5644
 
10.6%
I 4371
 
8.2%
M 3520
 
6.6%
P 3446
 
6.5%
L 3385
 
6.3%
A 3372
 
6.3%
T 2958
 
5.5%
D 2486
 
4.7%
B 2218
 
4.2%
Other values (16) 14909
27.9%
Other Punctuation
ValueCountFrequency (%)
. 5451
68.4%
& 1372
 
17.2%
' 488
 
6.1%
, 450
 
5.6%
/ 173
 
2.2%
: 14
 
0.2%
@ 6
 
0.1%
! 6
 
0.1%
; 4
 
0.1%
# 4
 
0.1%
Other values (2) 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 162
20.8%
2 138
17.8%
0 134
17.2%
4 85
10.9%
9 57
 
7.3%
3 51
 
6.6%
8 43
 
5.5%
6 40
 
5.1%
5 39
 
5.0%
7 28
 
3.6%
Math Symbol
ValueCountFrequency (%)
+ 28
82.4%
| 5
 
14.7%
> 1
 
2.9%
Final Punctuation
ValueCountFrequency (%)
107
98.2%
2
 
1.8%
Space Separator
ValueCountFrequency (%)
37420
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 758
100.0%
Open Punctuation
ValueCountFrequency (%)
( 223
100.0%
Close Punctuation
ValueCountFrequency (%)
) 223
100.0%
Format
ValueCountFrequency (%)
3
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Control
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 288895
85.9%
Common 47524
 
14.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 25649
 
8.9%
a 24600
 
8.5%
n 21548
 
7.5%
i 19612
 
6.8%
r 19403
 
6.7%
o 18483
 
6.4%
t 17825
 
6.2%
s 14884
 
5.2%
l 12127
 
4.2%
c 11192
 
3.9%
Other values (48) 103572
35.9%
Common
ValueCountFrequency (%)
37420
78.7%
. 5451
 
11.5%
& 1372
 
2.9%
- 758
 
1.6%
' 488
 
1.0%
, 450
 
0.9%
( 223
 
0.5%
) 223
 
0.5%
/ 173
 
0.4%
1 162
 
0.3%
Other values (25) 804
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 336284
> 99.9%
Punctuation 114
 
< 0.1%
None 21
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
37420
 
11.1%
e 25649
 
7.6%
a 24600
 
7.3%
n 21548
 
6.4%
i 19612
 
5.8%
r 19403
 
5.8%
o 18483
 
5.5%
t 17825
 
5.3%
s 14884
 
4.4%
l 12127
 
3.6%
Other values (73) 124733
37.1%
Punctuation
ValueCountFrequency (%)
107
93.9%
3
 
2.6%
2
 
1.8%
2
 
1.8%
None
ValueCountFrequency (%)
é 13
61.9%
ē 3
 
14.3%
ü 2
 
9.5%
ć 1
 
4.8%
è 1
 
4.8%
ä 1
 
4.8%

EmplRange
Categorical

MISSING 

Distinct: 9
Distinct (%): 0.1%
Missing: 562
Missing (%): 3.7%
Memory size: 118.6 KiB
1 to 4: 5920
5 to 9: 3383
10 to 19: 2256
20 to 49: 1697
50 to 99: 765
Other values (4): 579

Length

Max length: 10
Median length: 6
Mean length: 6.8032877
Min length: 6

Characters and Unicode

Total characters: 99328
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
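
The length and character figures for EmplRange are tied together: the mean length equals the total character count divided by the number of non-missing values, not by the total number of rows. A short check using the numbers reported in this section:

```python
# Figures copied from the EmplRange section of the report.
n_rows = 15162
missing = 562
total_characters = 99328

non_missing = n_rows - missing
mean_length = total_characters / non_missing

print(f"Non-missing values: {non_missing}")
print(f"Mean length: {mean_length:.7f}")
```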

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1 to 4
2nd row: 1 to 4
3rd row: 1 to 4
4th row: 5 to 9
5th row: 1 to 4

Common Values

Value         Count   Frequency (%)
1 to 4        5920    39.0%
5 to 9        3383    22.3%
10 to 19      2256    14.9%
20 to 49      1697    11.2%
50 to 99      765     5.0%
100 to 299    446     2.9%
300 to 499    77      0.5%
500 to 999    32      0.2%
1000 plus     24      0.2%
(Missing)     562     3.7%
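
The frequencies in this table are shares of all 15162 rows, missing values included, so the nine categories plus the missing count add up to the full dataset. A small sketch that recomputes the shares from the counts:

```python
# Category counts for EmplRange, copied from the Common Values table.
counts = {
    "1 to 4": 5920, "5 to 9": 3383, "10 to 19": 2256,
    "20 to 49": 1697, "50 to 99": 765, "100 to 299": 446,
    "300 to 499": 77, "500 to 999": 32, "1000 plus": 24,
}
missing = 562
n_rows = 15162

non_missing = sum(counts.values())
assert non_missing + missing == n_rows  # every row is accounted for

# Share of each category relative to all rows, rounded as in the report.
shares = {label: round(100 * c / n_rows, 1) for label, c in counts.items()}
for label, share in shares.items():
    print(f"{label}: {share}%")
```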

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
to 14576
33.3%
1 5920
13.5%
4 5920
13.5%
5 3383
 
7.7%
9 3383
 
7.7%
10 2256
 
5.2%
19 2256
 
5.2%
20 1697
 
3.9%
49 1697
 
3.9%
99 765
 
1.7%
Other values (9) 1923
 
4.4%

Most occurring characters

ValueCountFrequency (%)
29176
29.4%
t 14576
14.7%
o 14576
14.7%
1 10902
 
11.0%
9 10008
 
10.1%
4 7694
 
7.7%
0 5900
 
5.9%
5 4180
 
4.2%
2 2143
 
2.2%
3 77
 
0.1%
Other values (4) 96
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40904
41.2%
Lowercase Letter 29248
29.4%
Space Separator 29176
29.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 10902
26.7%
9 10008
24.5%
4 7694
18.8%
0 5900
14.4%
5 4180
 
10.2%
2 2143
 
5.2%
3 77
 
0.2%
Lowercase Letter
ValueCountFrequency (%)
t 14576
49.8%
o 14576
49.8%
p 24
 
0.1%
l 24
 
0.1%
u 24
 
0.1%
s 24
 
0.1%
Space Separator
ValueCountFrequency (%)
29176
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 70080
70.6%
Latin 29248
29.4%

Most frequent character per script

Common
ValueCountFrequency (%)
29176
41.6%
1 10902
 
15.6%
9 10008
 
14.3%
4 7694
 
11.0%
0 5900
 
8.4%
5 4180
 
6.0%
2 2143
 
3.1%
3 77
 
0.1%
Latin
ValueCountFrequency (%)
t 14576
49.8%
o 14576
49.8%
p 24
 
0.1%
l 24
 
0.1%
u 24
 
0.1%
s 24
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
29176
29.4%
t 14576
14.7%
o 14576
14.7%
1 10902
 
11.0%
9 10008
 
10.1%
4 7694
 
7.7%
0 5900
 
5.9%
5 4180
 
4.2%
2 2143
 
2.2%
3 77
 
0.1%
Other values (4) 96
 
0.1%

NAICSCode
Real number (ℝ)

HIGH CORRELATION 

Distinct: 660
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 538694.85
Minimum: 1
Maximum: 913910
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 311990
Q1: 418990
Median: 531210
Q3: 621510
95-th percentile: 812116
Maximum: 913910
Range: 913909
Interquartile range (IQR): 202520

Descriptive statistics

Standard deviation: 160918.36
Coefficient of variation (CV): 0.29871894
Kurtosis: -0.65990818
Mean: 538694.85
Median Absolute Deviation (MAD): 91900
Skewness: 0.23946278
Sum: 8.1676913 × 10⁹
Variance: 2.5894717 × 10¹⁰
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
722512 882
 
5.8%
811111 379
 
2.5%
722511 338
 
2.2%
621210 321
 
2.1%
541110 276
 
1.8%
812115 276
 
1.8%
621110 252
 
1.7%
611110 236
 
1.6%
813110 225
 
1.5%
541212 219
 
1.4%
Other values (650) 11758
77.5%
ValueCountFrequency (%)
1 5
< 0.1%
41611 3
 
< 0.1%
112510 1
 
< 0.1%
112999 1
 
< 0.1%
115110 1
 
< 0.1%
212299 2
 
< 0.1%
213119 2
 
< 0.1%
221119 1
 
< 0.1%
221122 9
0.1%
221210 2
 
< 0.1%
ValueCountFrequency (%)
913910 21
 
0.1%
913140 22
 
0.1%
913130 1
 
< 0.1%
912910 6
 
< 0.1%
912210 5
 
< 0.1%
912190 2
 
< 0.1%
911910 7
 
< 0.1%
911410 1
 
< 0.1%
911390 1
 
< 0.1%
911320 67
0.4%

NAICSTitle
Categorical

HIGH CORRELATION 

Distinct20
Distinct (%)0.1%
Missing5
Missing (%)< 0.1%
Memory size118.6 KiB
Retail
2169 
Manufacturing
1813 
Other Services
1760 
Wholesale
1454 
Accommodation
1357 
Other values (15)
6604 

Length

Max length27
Median length16
Mean length11.052715
Min length4

Characters and Unicode

Total characters167526
Distinct characters35
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowAccommodation
2nd rowManufacturing
3rd rowFinance
4th rowRetail
5th rowRetail

Common Values

ValueCountFrequency (%)
Retail 2169
14.3%
Manufacturing 1813
12.0%
Other Services 1760
11.6%
Wholesale 1454
9.6%
Accommodation 1357
9.0%
Health Care 1330
8.8%
Professional 1328
8.8%
Transportation 723
 
4.8%
Finance 602
 
4.0%
Educational 597
 
3.9%
Other values (10) 2024
13.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
retail 2169
11.6%
manufacturing 1813
9.7%
other 1761
9.4%
services 1761
9.4%
wholesale 1454
 
7.8%
accommodation 1357
 
7.2%
health 1330
 
7.1%
care 1330
 
7.1%
professional 1328
 
7.1%
transportation 723
 
3.9%
Other values (16) 3735
19.9%

Most occurring characters

ValueCountFrequency (%)
a 17544
 
10.5%
e 16402
 
9.8%
t 14071
 
8.4%
i 13140
 
7.8%
n 11901
 
7.1%
o 11732
 
7.0%
r 10991
 
6.6%
l 8859
 
5.3%
s 8387
 
5.0%
c 8174
 
4.9%
Other values (25) 46325
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 145161
86.6%
Uppercase Letter 18761
 
11.2%
Space Separator 3604
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 17544
12.1%
e 16402
11.3%
t 14071
9.7%
i 13140
9.1%
n 11901
8.2%
o 11732
8.1%
r 10991
7.6%
l 8859
 
6.1%
s 8387
 
5.8%
c 8174
 
5.6%
Other values (10) 23960
16.5%
Uppercase Letter
ValueCountFrequency (%)
R 2538
13.5%
A 2202
11.7%
M 1910
10.2%
C 1882
10.0%
S 1761
9.4%
O 1761
9.4%
P 1471
7.8%
W 1454
7.8%
H 1330
7.1%
E 966
 
5.1%
Other values (4) 1486
7.9%
Space Separator
ValueCountFrequency (%)
3604
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 163922
97.8%
Common 3604
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 17544
10.7%
e 16402
 
10.0%
t 14071
 
8.6%
i 13140
 
8.0%
n 11901
 
7.3%
o 11732
 
7.2%
r 10991
 
6.7%
l 8859
 
5.4%
s 8387
 
5.1%
c 8174
 
5.0%
Other values (24) 42721
26.1%
Common
ValueCountFrequency (%)
3604
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 167526
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 17544
 
10.5%
e 16402
 
9.8%
t 14071
 
8.4%
i 13140
 
7.8%
n 11901
 
7.1%
o 11732
 
7.0%
r 10991
 
6.6%
l 8859
 
5.3%
s 8387
 
5.0%
c 8174
 
4.9%
Other values (25) 46325
27.7%
Distinct659
Distinct (%)4.3%
Missing0
Missing (%)0.0%
Memory size118.6 KiB

Length

Max length107
Median length70
Mean length35.019852
Min length6

Characters and Unicode

Total characters530971
Distinct characters59
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique80 ?
Unique (%)0.5%

Sample

1st rowLimited-service eating places
2nd rowWineries
3rd rowInsurance Agencies and Brokerages
4th rowAll Other Miscellaneous Store Retailers (except Beer and Wine-Making Supplies Stores)
5th rowSupermarkets and Other Grocery (except Convenience) Stores
ValueCountFrequency (%)
and 6250
 
9.8%
other 3471
 
5.4%
stores 1861
 
2.9%
offices 1697
 
2.6%
services 1687
 
2.6%
of 1634
 
2.6%
all 1618
 
2.5%
wholesaler-distributors 1389
 
2.2%
manufacturing 1372
 
2.1%
places 903
 
1.4%
Other values (912) 42178
65.8%

Most occurring characters

ValueCountFrequency (%)
e 53747
 
10.1%
49254
 
9.3%
i 38531
 
7.3%
r 36413
 
6.9%
a 35230
 
6.6%
t 35001
 
6.6%
n 34744
 
6.5%
s 30804
 
5.8%
o 26506
 
5.0%
l 22078
 
4.2%
Other values (49) 168663
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 421250
79.3%
Uppercase Letter 53270
 
10.0%
Space Separator 49391
 
9.3%
Dash Punctuation 3604
 
0.7%
Other Punctuation 2046
 
0.4%
Close Punctuation 705
 
0.1%
Open Punctuation 705
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 53747
12.8%
i 38531
9.1%
r 36413
8.6%
a 35230
8.4%
t 35001
8.3%
n 34744
 
8.2%
s 30804
 
7.3%
o 26506
 
6.3%
l 22078
 
5.2%
c 20760
 
4.9%
Other values (16) 87436
20.8%
Uppercase Letter
ValueCountFrequency (%)
S 7658
14.4%
O 5795
10.9%
A 4811
 
9.0%
C 4613
 
8.7%
M 4139
 
7.8%
P 3506
 
6.6%
D 2925
 
5.5%
W 2312
 
4.3%
L 2280
 
4.3%
E 2205
 
4.1%
Other values (14) 13026
24.5%
Other Punctuation
ValueCountFrequency (%)
, 1754
85.7%
' 131
 
6.4%
& 101
 
4.9%
. 60
 
2.9%
Space Separator
ValueCountFrequency (%)
49254
99.7%
  137
 
0.3%
Dash Punctuation
ValueCountFrequency (%)
- 3604
100.0%
Close Punctuation
ValueCountFrequency (%)
) 705
100.0%
Open Punctuation
ValueCountFrequency (%)
( 705
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 474520
89.4%
Common 56451
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 53747
 
11.3%
i 38531
 
8.1%
r 36413
 
7.7%
a 35230
 
7.4%
t 35001
 
7.4%
n 34744
 
7.3%
s 30804
 
6.5%
o 26506
 
5.6%
l 22078
 
4.7%
c 20760
 
4.4%
Other values (40) 140706
29.7%
Common
ValueCountFrequency (%)
49254
87.3%
- 3604
 
6.4%
, 1754
 
3.1%
) 705
 
1.2%
( 705
 
1.2%
  137
 
0.2%
' 131
 
0.2%
& 101
 
0.2%
. 60
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 530834
> 99.9%
None 137
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 53747
 
10.1%
49254
 
9.3%
i 38531
 
7.3%
r 36413
 
6.9%
a 35230
 
6.6%
t 35001
 
6.6%
n 34744
 
6.5%
s 30804
 
5.8%
o 26506
 
5.0%
l 22078
 
4.2%
Other values (48) 168526
31.7%
None
ValueCountFrequency (%)
  137
100.0%

Phone
Text

Distinct14222
Distinct (%)93.8%
Missing0
Missing (%)0.0%
Memory size118.6 KiB

Length

Max length20
Median length12
Mean length11.662446
Min length1

Characters and Unicode

Total characters176826
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique13832 ?
Unique (%)91.2%

Sample

1st row905-821-8999
2nd row905-813-3436
3rd row905-567-7003
4th row905-542-1505
5th row905-821-8960
ValueCountFrequency (%)
905-615-3200 11
 
0.1%
905-670-4070 7
 
< 0.1%
905-624-3811 6
 
< 0.1%
905-615-3777 5
 
< 0.1%
905-677-9354 5
 
< 0.1%
905-896-0210 5
 
< 0.1%
905-273-5888 5
 
< 0.1%
905-282-6800 5
 
< 0.1%
905-629-1873 5
 
< 0.1%
905-271-2400 4
 
< 0.1%
Other values (14215) 14644
99.6%

Most occurring characters

ValueCountFrequency (%)
- 29367
16.6%
0 25687
14.5%
5 22084
12.5%
9 21596
12.2%
6 13828
7.8%
2 13660
7.7%
7 11935
6.7%
8 11568
 
6.5%
4 9578
 
5.4%
1 9520
 
5.4%
Other values (4) 8003
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 146986
83.1%
Dash Punctuation 29371
 
16.6%
Space Separator 468
 
0.3%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 25687
17.5%
5 22084
15.0%
9 21596
14.7%
6 13828
9.4%
2 13660
9.3%
7 11935
8.1%
8 11568
7.9%
4 9578
 
6.5%
1 9520
 
6.5%
3 7530
 
5.1%
Dash Punctuation
ValueCountFrequency (%)
- 29367
> 99.9%
4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
468
100.0%
Lowercase Letter
ValueCountFrequency (%)
x 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 176825
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 29367
16.6%
0 25687
14.5%
5 22084
12.5%
9 21596
12.2%
6 13828
7.8%
2 13660
7.7%
7 11935
6.7%
8 11568
 
6.5%
4 9578
 
5.4%
1 9520
 
5.4%
Other values (3) 8002
 
4.5%
Latin
ValueCountFrequency (%)
x 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 176822
> 99.9%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 29367
16.6%
0 25687
14.5%
5 22084
12.5%
9 21596
12.2%
6 13828
7.8%
2 13660
7.7%
7 11935
6.7%
8 11568
 
6.5%
4 9578
 
5.4%
1 9520
 
5.4%
Other values (3) 7999
 
4.5%
Punctuation
ValueCountFrequency (%)
4
100.0%

Fax
Text

Distinct7388
Distinct (%)48.7%
Missing0
Missing (%)0.0%
Memory size118.6 KiB

Length

Max length14
Median length13
Mean length6.5524997
Min length1

Characters and Unicode

Total characters99349
Distinct characters13
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7198 ?
Unique (%)47.5%

Sample

1st row
2nd row
3rd row905-567-6003
4th row
5th row
ValueCountFrequency (%)
905-361-6401 8
 
0.1%
905-896-9380 5
 
0.1%
905-282-1508 5
 
0.1%
1-855-552-7329 5
 
0.1%
905-273-5999 4
 
0.1%
905-625-4815 4
 
0.1%
905-822-2673 4
 
0.1%
905-819-1331 3
 
< 0.1%
905-542-0987 3
 
< 0.1%
905-677-9093 3
 
< 0.1%
Other values (7379) 7568
99.4%

Most occurring characters

ValueCountFrequency (%)
- 15454
15.6%
0 12492
12.6%
5 12201
12.3%
9 11816
11.9%
7554
7.6%
6 7375
7.4%
2 6972
7.0%
8 6220
6.3%
7 5935
 
6.0%
1 4941
 
5.0%
Other values (3) 8389
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 76340
76.8%
Dash Punctuation 15454
 
15.6%
Space Separator 7554
 
7.6%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 12492
16.4%
5 12201
16.0%
9 11816
15.5%
6 7375
9.7%
2 6972
9.1%
8 6220
8.1%
7 5935
7.8%
1 4941
 
6.5%
4 4359
 
5.7%
3 4029
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 15454
100.0%
Space Separator
ValueCountFrequency (%)
7554
100.0%
Lowercase Letter
ValueCountFrequency (%)
t 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 99348
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 15454
15.6%
0 12492
12.6%
5 12201
12.3%
9 11816
11.9%
7554
7.6%
6 7375
7.4%
2 6972
7.0%
8 6220
6.3%
7 5935
 
6.0%
1 4941
 
5.0%
Other values (2) 8388
8.4%
Latin
ValueCountFrequency (%)
t 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 99349
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 15454
15.6%
0 12492
12.6%
5 12201
12.3%
9 11816
11.9%
7554
7.6%
6 7375
7.4%
2 6972
7.0%
8 6220
6.3%
7 5935
 
6.0%
1 4941
 
5.0%
Other values (3) 8389
8.4%
Distinct1998
Distinct (%)13.2%
Missing0
Missing (%)0.0%
Memory size118.6 KiB

Length

Max length20
Median length1
Mean length2.7887482
Min length1

Characters and Unicode

Total characters42283
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1932 ?
Unique (%)12.7%

Sample

1st row
2nd row
3rd row
4th row
5th row
ValueCountFrequency (%)
1-800-465-2422 7
 
0.3%
1-800-769-2511 7
 
0.3%
1-855-552-7467 5
 
0.2%
1-800-472-6842 5
 
0.2%
1-800-879-2847 4
 
0.2%
1-877-849-3637 4
 
0.2%
1-800-668-1179 3
 
0.1%
1-800-254-0778 3
 
0.1%
1-800-387-9754 3
 
0.1%
1-800-895-5295 3
 
0.1%
Other values (1991) 2048
97.9%

Most occurring characters

ValueCountFrequency (%)
13078
30.9%
- 6232
14.7%
8 4459
 
10.5%
1 3103
 
7.3%
6 2642
 
6.2%
0 2614
 
6.2%
7 2302
 
5.4%
2 1824
 
4.3%
5 1773
 
4.2%
3 1582
 
3.7%
Other values (4) 2674
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 22971
54.3%
Space Separator 13078
30.9%
Dash Punctuation 6233
 
14.7%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 4459
19.4%
1 3103
13.5%
6 2642
11.5%
0 2614
11.4%
7 2302
10.0%
2 1824
7.9%
5 1773
 
7.7%
3 1582
 
6.9%
4 1500
 
6.5%
9 1172
 
5.1%
Dash Punctuation
ValueCountFrequency (%)
- 6232
> 99.9%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
13078
100.0%
Lowercase Letter
ValueCountFrequency (%)
x 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42282
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
13078
30.9%
- 6232
14.7%
8 4459
 
10.5%
1 3103
 
7.3%
6 2642
 
6.2%
0 2614
 
6.2%
7 2302
 
5.4%
2 1824
 
4.3%
5 1773
 
4.2%
3 1582
 
3.7%
Other values (3) 2673
 
6.3%
Latin
ValueCountFrequency (%)
x 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42282
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13078
30.9%
- 6232
14.7%
8 4459
 
10.5%
1 3103
 
7.3%
6 2642
 
6.2%
0 2614
 
6.2%
7 2302
 
5.4%
2 1824
 
4.3%
5 1773
 
4.2%
3 1582
 
3.7%
Other values (3) 2673
 
6.3%
Punctuation
ValueCountFrequency (%)
1
100.0%

EMail
Text

Distinct10153
Distinct (%)67.0%
Missing0
Missing (%)0.0%
Memory size118.6 KiB

Length

Max length72
Median length60
Mean length15.725828
Min length1

Characters and Unicode

Total characters238435
Distinct characters75
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10018 ?
Unique (%)66.1%

Sample

1st row
2nd rowvintnersgta@gmail.com
3rd rowInfo@ukinsurance.ca
4th rowinfo@prestigepools.ca
5th rowTheNextVentureInc@outlook.com
ValueCountFrequency (%)
info@taxwide.com 5
 
< 0.1%
insure@all-risks.com 5
 
< 0.1%
info@classicbrand.ca 4
 
< 0.1%
info@baylismedical.com 4
 
< 0.1%
info@attarmetals.com 3
 
< 0.1%
info@somethinsweet.ca 3
 
< 0.1%
info@camcartage.com 3
 
< 0.1%
3
 
< 0.1%
info@mnsinfo.org 2
 
< 0.1%
paul@bncc.ca 2
 
< 0.1%
Other values (10164) 10313
99.7%

Most occurring characters

ValueCountFrequency (%)
a 21510
 
9.0%
o 21248
 
8.9%
c 18010
 
7.6%
i 16200
 
6.8%
e 15448
 
6.5%
m 14016
 
5.9%
n 13489
 
5.7%
s 12442
 
5.2%
r 11443
 
4.8%
. 11301
 
4.7%
Other values (65) 83328
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 206641
86.7%
Other Punctuation 21676
 
9.1%
Space Separator 4916
 
2.1%
Decimal Number 2976
 
1.2%
Uppercase Letter 1688
 
0.7%
Dash Punctuation 379
 
0.2%
Connector Punctuation 154
 
0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 21510
10.4%
o 21248
10.3%
c 18010
 
8.7%
i 16200
 
7.8%
e 15448
 
7.5%
m 14016
 
6.8%
n 13489
 
6.5%
s 12442
 
6.0%
r 11443
 
5.5%
t 10768
 
5.2%
Other values (16) 52067
25.2%
Uppercase Letter
ValueCountFrequency (%)
I 368
21.8%
S 203
12.0%
M 154
 
9.1%
A 122
 
7.2%
C 99
 
5.9%
D 86
 
5.1%
R 69
 
4.1%
P 63
 
3.7%
T 59
 
3.5%
H 54
 
3.2%
Other values (16) 411
24.3%
Decimal Number
ValueCountFrequency (%)
1 547
18.4%
0 492
16.5%
2 441
14.8%
3 251
8.4%
5 229
7.7%
6 217
 
7.3%
4 217
 
7.3%
7 205
 
6.9%
8 191
 
6.4%
9 186
 
6.2%
Other Punctuation
ValueCountFrequency (%)
. 11301
52.1%
@ 10337
47.7%
/ 23
 
0.1%
& 7
 
< 0.1%
, 5
 
< 0.1%
' 2
 
< 0.1%
: 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
4916
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 379
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 154
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 208329
87.4%
Common 30106
 
12.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 21510
10.3%
o 21248
 
10.2%
c 18010
 
8.6%
i 16200
 
7.8%
e 15448
 
7.4%
m 14016
 
6.7%
n 13489
 
6.5%
s 12442
 
6.0%
r 11443
 
5.5%
t 10768
 
5.2%
Other values (42) 53755
25.8%
Common
ValueCountFrequency (%)
. 11301
37.5%
@ 10337
34.3%
4916
16.3%
1 547
 
1.8%
0 492
 
1.6%
2 441
 
1.5%
- 379
 
1.3%
3 251
 
0.8%
5 229
 
0.8%
6 217
 
0.7%
Other values (13) 996
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 238434
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 21510
 
9.0%
o 21248
 
8.9%
c 18010
 
7.6%
i 16200
 
6.8%
e 15448
 
6.5%
m 14016
 
5.9%
n 13489
 
5.7%
s 12442
 
5.2%
r 11443
 
4.8%
. 11301
 
4.7%
Other values (64) 83327
34.9%
Punctuation
ValueCountFrequency (%)
1
100.0%

WebAddress
Text

Distinct: 9733
Distinct (%): 64.2%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 50
Median length: 43
Mean length: 14.569252
Min length: 1

Characters and Unicode

Total characters: 220899
Distinct characters: 73
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
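
As a rough illustration of how per-category character counts like the ones in this report can be derived, the sketch below uses Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are illustrative assumptions, not part of the profiling tool, and the two input strings are taken from the WebAddress values listed in this report.

```python
import unicodedata
from collections import Counter

# Readable names for a few Unicode general-category codes (illustrative subset).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for categories not in the map.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["www.thairice.ca", "www.petro-canada.ca"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Applied to a whole text column, the same loop would yield a category table of the kind shown under "Most occurring categories".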

Unique

Unique: 9145
Unique (%): 60.3%

Sample

1st row: www.thairice.ca
2nd row: www.vintnerscellargta.com
3rd row: www.ukinsurance.ca
4th row: www.prestigepools.ca
5th row: (blank)

Value | Count | Frequency (%)
www.dpcdsb.org | 43 | 0.4%
www.subway.com | 42 | 0.4%
www.timhortons.com | 42 | 0.4%
www.shoppersdrugmart.ca | 24 | 0.2%
www.petro-canada.ca | 21 | 0.2%
www.mississauga.ca/portal/residents/fire | 19 | 0.2%
www.starbucks.ca | 17 | 0.2%
www.shell.ca | 17 | 0.2%
www.td.com | 16 | 0.1%
www.mcdonalds.ca | 16 | 0.1%
Other values (9726) | 10809 | 97.7%

Most occurring characters

ValueCountFrequency (%)
w 34747
15.7%
. 22293
 
10.1%
c 17440
 
7.9%
a 17290
 
7.8%
o 15563
 
7.0%
e 12891
 
5.8%
m 10686
 
4.8%
s 9965
 
4.5%
i 9863
 
4.5%
r 9767
 
4.4%
Other values (63) 60394
27.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 192924
87.3%
Other Punctuation 22486
 
10.2%
Space Separator 4134
 
1.9%
Dash Punctuation 495
 
0.2%
Decimal Number 474
 
0.2%
Uppercase Letter 374
 
0.2%
Math Symbol 9
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
w 34747
18.0%
c 17440
 
9.0%
a 17290
 
9.0%
o 15563
 
8.1%
e 12891
 
6.7%
m 10686
 
5.5%
s 9965
 
5.2%
i 9863
 
5.1%
r 9767
 
5.1%
t 9151
 
4.7%
Other values (16) 45561
23.6%
Uppercase Letter
ValueCountFrequency (%)
A 30
 
8.0%
C 28
 
7.5%
S 28
 
7.5%
T 25
 
6.7%
R 23
 
6.1%
M 23
 
6.1%
P 22
 
5.9%
I 21
 
5.6%
E 20
 
5.3%
B 19
 
5.1%
Other values (16) 135
36.1%
Decimal Number
ValueCountFrequency (%)
2 97
20.5%
1 93
19.6%
0 74
15.6%
4 64
13.5%
3 37
 
7.8%
6 28
 
5.9%
9 28
 
5.9%
5 20
 
4.2%
8 19
 
4.0%
7 14
 
3.0%
Other Punctuation
ValueCountFrequency (%)
. 22293
99.1%
/ 182
 
0.8%
& 5
 
< 0.1%
@ 4
 
< 0.1%
\ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
4134
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 495
100.0%
Math Symbol
ValueCountFrequency (%)
~ 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 193298
87.5%
Common 27601
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 34747
18.0%
c 17440
 
9.0%
a 17290
 
8.9%
o 15563
 
8.1%
e 12891
 
6.7%
m 10686
 
5.5%
s 9965
 
5.2%
i 9863
 
5.1%
r 9767
 
5.1%
t 9151
 
4.7%
Other values (42) 45935
23.8%
Common
ValueCountFrequency (%)
. 22293
80.8%
4134
 
15.0%
- 495
 
1.8%
/ 182
 
0.7%
2 97
 
0.4%
1 93
 
0.3%
0 74
 
0.3%
4 64
 
0.2%
3 37
 
0.1%
6 28
 
0.1%
Other values (11) 104
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 220899
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
w 34747
15.7%
. 22293
 
10.1%
c 17440
 
7.9%
a 17290
 
7.8%
o 15563
 
7.0%
e 12891
 
5.8%
m 10686
 
4.8%
s 9965
 
4.5%
i 9863
 
4.5%
r 9767
 
4.4%
Other values (63) 60394
27.3%

StreetNo
Real number (ℝ)

Distinct: 2848
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2947.7062
Minimum: 1
Maximum: 7895
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 55
Q1: 1031.25
Median: 2395
Q3: 5120
95-th percentile: 7070
Maximum: 7895
Range: 7894
Interquartile range (IQR): 4088.75

Descriptive statistics

Standard deviation: 2362.5679
Coefficient of variation (CV): 0.80149369
Kurtosis: -1.0709211
Mean: 2947.7062
Median Absolute Deviation (MAD): 1695
Skewness: 0.51695337
Sum: 44693122
Variance: 5581727.3
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
100 | 234 | 1.5%
7205 | 121 | 0.8%
5100 | 117 | 0.8%
1250 | 97 | 0.6%
1 | 83 | 0.5%
1550 | 76 | 0.5%
2425 | 64 | 0.4%
50 | 62 | 0.4%
4141 | 56 | 0.4%
1200 | 52 | 0.3%
Other values (2838) | 14200 | 93.7%

Minimum 10 values

Value | Count | Frequency (%)
1 | 83 | 0.5%
2 | 37 | 0.2%
3 | 34 | 0.2%
4 | 27 | 0.2%
5 | 1 | < 0.1%
6 | 6 | < 0.1%
7 | 6 | < 0.1%
8 | 4 | < 0.1%
9 | 4 | < 0.1%
10 | 27 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
7895 | 25 | 0.2%
7890 | 1 | < 0.1%
7885 | 14 | 0.1%
7880 | 1 | < 0.1%
7875 | 8 | 0.1%
7860 | 1 | < 0.1%
7855 | 1 | < 0.1%
7850 | 1 | < 0.1%
7840 | 1 | < 0.1%
7832 | 1 | < 0.1%
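
The derived StreetNo statistics can be cross-checked against the reported quantiles and moments. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the values listed above. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the StreetNo statistics in this report.
minimum, maximum = 1, 7895
q1, q3 = 1031.25, 5120
mean, std = 2947.7062, 2362.5679

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(value_range, iqr, round(cv, 6), round(variance, 1))
```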

StreetName
Text

Distinct: 605
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 26
Median length: 21
Mean length: 11.992151
Min length: 3

Characters and Unicode

Total characters: 181825
Distinct characters: 53
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 140
Unique (%): 0.9%

Sample

1st row: Britannia Rd W
2nd row: Britannia Rd W
3rd row: Britannia Rd W
4th row: Britannia Rd W
5th row: Britannia Rd W

Value | Count | Frequency (%)
rd | 5467 | 15.1%
dr | 3509 | 9.7%
e | 2387 | 6.6%
st | 1898 | 5.3%
blvd | 1613 | 4.5%
w | 1419 | 3.9%
dundas | 963 | 2.7%
ave | 817 | 2.3%
centre | 513 | 1.4%
lakeshore | 507 | 1.4%
Other values (607) | 17036 | 47.2%

Most occurring characters

ValueCountFrequency (%)
20967
 
11.5%
r 15034
 
8.3%
e 14095
 
7.8%
a 11425
 
6.3%
d 10823
 
6.0%
n 9648
 
5.3%
t 9301
 
5.1%
i 8576
 
4.7%
o 7242
 
4.0%
l 6386
 
3.5%
Other values (43) 68328
37.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 124348
68.4%
Uppercase Letter 36410
 
20.0%
Space Separator 20967
 
11.5%
Dash Punctuation 89
 
< 0.1%
Other Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 15034
12.1%
e 14095
11.3%
a 11425
9.2%
d 10823
8.7%
n 9648
 
7.8%
t 9301
 
7.5%
i 8576
 
6.9%
o 7242
 
5.8%
l 6386
 
5.1%
s 5392
 
4.3%
Other values (15) 26426
21.3%
Uppercase Letter
ValueCountFrequency (%)
R 6061
16.6%
D 5676
15.6%
S 3663
10.1%
E 3248
8.9%
B 2839
7.8%
C 2583
7.1%
W 2289
 
6.3%
A 1867
 
5.1%
M 1824
 
5.0%
T 1262
 
3.5%
Other values (14) 5098
14.0%
Other Punctuation
ValueCountFrequency (%)
' 10
90.9%
. 1
 
9.1%
Space Separator
ValueCountFrequency (%)
20967
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 160758
88.4%
Common 21067
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 15034
 
9.4%
e 14095
 
8.8%
a 11425
 
7.1%
d 10823
 
6.7%
n 9648
 
6.0%
t 9301
 
5.8%
i 8576
 
5.3%
o 7242
 
4.5%
l 6386
 
4.0%
R 6061
 
3.8%
Other values (39) 62167
38.7%
Common
ValueCountFrequency (%)
20967
99.5%
- 89
 
0.4%
' 10
 
< 0.1%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 181825
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20967
 
11.5%
r 15034
 
8.3%
e 14095
 
7.8%
a 11425
 
6.3%
d 10823
 
6.0%
n 9648
 
5.3%
t 9301
 
5.1%
i 8576
 
4.7%
o 7242
 
4.0%
l 6386
 
3.5%
Other values (43) 68328
37.6%

PostalCode
Text

Distinct: 2676
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 8
Median length: 7
Mean length: 6.9990107
Min length: 6

Characters and Unicode

Total characters: 106119
Distinct characters: 37
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 845
Unique (%): 5.6%
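
The report lists both a "Distinct" and a "Unique" count for each variable. The sketch below shows one way the two can differ, assuming "Distinct" means the number of different values present and "Unique" means the number of values that occur exactly once. That reading is an assumption about the report's definitions, and the input list is a hypothetical toy sample rather than the real column.

```python
from collections import Counter

# Hypothetical toy column, for illustration only.
values = ["L5V 1N2", "L5V 1N2", "L5M 4Y4", "L5M 4Y4", "L5M 4Y4", "L4W 4T1", "L4Z 3X7"]

freq = Counter(values)
distinct = len(freq)                                # different values present
unique = sum(1 for c in freq.values() if c == 1)    # values seen exactly once
distinct_pct = 100 * distinct / len(values)
unique_pct = 100 * unique / len(values)

print(distinct, unique, round(distinct_pct, 1), round(unique_pct, 1))
```

Under this reading, "Unique" can never exceed "Distinct", which is consistent with every variable in this report.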

Sample

1st row: L5V 1N2
2nd row: L5V 1N2
3rd row: L5M 4Y4
4th row: L5M 4Y4
5th row: L5M 4Y4

Value | Count | Frequency (%)
l4w | 2434 | 8.0%
l5t | 1569 | 5.2%
l5n | 1183 | 3.9%
l4z | 1013 | 3.3%
l5b | 872 | 2.9%
l5l | 829 | 2.7%
l5s | 786 | 2.6%
l5m | 716 | 2.4%
l4t | 686 | 2.3%
l5a | 632 | 2.1%
Other values (1026) | 19587 | 64.6%

Most occurring characters

ValueCountFrequency (%)
L 16727
15.8%
15146
14.3%
5 12182
11.5%
4 9438
 
8.9%
1 7453
 
7.0%
2 5123
 
4.8%
W 3173
 
3.0%
3 3169
 
3.0%
T 2860
 
2.7%
6 2221
 
2.1%
Other values (27) 28627
27.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 45484
42.9%
Decimal Number 45483
42.9%
Space Separator 15146
 
14.3%
Lowercase Letter 6
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 16727
36.8%
W 3173
 
7.0%
T 2860
 
6.3%
N 1853
 
4.1%
A 1803
 
4.0%
B 1728
 
3.8%
Z 1682
 
3.7%
C 1589
 
3.5%
V 1497
 
3.3%
M 1441
 
3.2%
Other values (12) 11131
24.5%
Decimal Number
ValueCountFrequency (%)
5 12182
26.8%
4 9438
20.8%
1 7453
16.4%
2 5123
11.3%
3 3169
 
7.0%
6 2221
 
4.9%
9 1844
 
4.1%
8 1820
 
4.0%
7 1605
 
3.5%
0 628
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
k 2
33.3%
l 2
33.3%
c 1
16.7%
g 1
16.7%
Space Separator
ValueCountFrequency (%)
15146
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 60629
57.1%
Latin 45490
42.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 16727
36.8%
W 3173
 
7.0%
T 2860
 
6.3%
N 1853
 
4.1%
A 1803
 
4.0%
B 1728
 
3.8%
Z 1682
 
3.7%
C 1589
 
3.5%
V 1497
 
3.3%
M 1441
 
3.2%
Other values (16) 11137
24.5%
Common
ValueCountFrequency (%)
15146
25.0%
5 12182
20.1%
4 9438
15.6%
1 7453
12.3%
2 5123
 
8.4%
3 3169
 
5.2%
6 2221
 
3.7%
9 1844
 
3.0%
8 1820
 
3.0%
7 1605
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106119
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 16727
15.8%
15146
14.3%
5 12182
11.5%
4 9438
 
8.9%
1 7453
 
7.0%
2 5123
 
4.8%
W 3173
 
3.0%
3 3169
 
3.0%
T 2860
 
2.7%
6 2221
 
2.1%
Other values (27) 28627
27.0%

BuildingNo
Text

Distinct: 71
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 23
Median length: 1
Mean length: 1.2499011
Min length: 1

Characters and Unicode

Total characters: 18951
Distinct characters: 50
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 0.2%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Value | Count | Frequency (%)
bldg | 633 | 43.5%
1 | 164 | 11.3%
2 | 162 | 11.1%
a | 71 | 4.9%
3 | 60 | 4.1%
b | 59 | 4.1%
4 | 42 | 2.9%
tower | 27 | 1.9%
c | 26 | 1.8%
plaza | 21 | 1.4%
Other values (42) | 189 | 13.0%

Most occurring characters

ValueCountFrequency (%)
15129
79.8%
B 697
 
3.7%
l 669
 
3.5%
g 650
 
3.4%
d 638
 
3.4%
1 193
 
1.0%
2 171
 
0.9%
a 86
 
0.5%
A 73
 
0.4%
3 65
 
0.3%
Other values (40) 580
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Space Separator 15129
79.8%
Lowercase Letter 2331
 
12.3%
Uppercase Letter 954
 
5.0%
Decimal Number 534
 
2.8%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 697
73.1%
A 73
 
7.7%
T 27
 
2.8%
E 27
 
2.8%
C 27
 
2.8%
P 21
 
2.2%
K 18
 
1.9%
H 15
 
1.6%
W 12
 
1.3%
D 9
 
0.9%
Other values (9) 28
 
2.9%
Lowercase Letter
ValueCountFrequency (%)
l 669
28.7%
g 650
27.9%
d 638
27.4%
a 86
 
3.7%
e 58
 
2.5%
r 48
 
2.1%
t 34
 
1.5%
o 32
 
1.4%
s 30
 
1.3%
w 28
 
1.2%
Other values (9) 58
 
2.5%
Decimal Number
ValueCountFrequency (%)
1 193
36.1%
2 171
32.0%
3 65
 
12.2%
4 42
 
7.9%
5 15
 
2.8%
6 12
 
2.2%
9 11
 
2.1%
7 10
 
1.9%
0 10
 
1.9%
8 5
 
0.9%
Space Separator
ValueCountFrequency (%)
15129
100.0%
Other Punctuation
ValueCountFrequency (%)
& 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15666
82.7%
Latin 3285
 
17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 697
21.2%
l 669
20.4%
g 650
19.8%
d 638
19.4%
a 86
 
2.6%
A 73
 
2.2%
e 58
 
1.8%
r 48
 
1.5%
t 34
 
1.0%
o 32
 
1.0%
Other values (28) 300
9.1%
Common
ValueCountFrequency (%)
15129
96.6%
1 193
 
1.2%
2 171
 
1.1%
3 65
 
0.4%
4 42
 
0.3%
5 15
 
0.1%
6 12
 
0.1%
9 11
 
0.1%
7 10
 
0.1%
0 10
 
0.1%
Other values (2) 8
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18951
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
15129
79.8%
B 697
 
3.7%
l 669
 
3.5%
g 650
 
3.4%
d 638
 
3.4%
1 193
 
1.0%
2 171
 
0.9%
a 86
 
0.5%
A 73
 
0.4%
3 65
 
0.3%
Other values (40) 580
 
3.1%

UnitNo
Text

Distinct: 1853
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 64
Median length: 1
Mean length: 2.2480543
Min length: 1

Characters and Unicode

Total characters: 34085
Distinct characters: 66
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1253
Unique (%): 8.3%

Sample

1st row: 2
2nd row: 16
3rd row: 200
4th row: 5
5th row: 4

Value | Count | Frequency (%)
1 | 628 | 5.2%
2 | 475 | 3.9%
3 | 449 | 3.7%
4 | 408 | 3.4%
5 | 369 | 3.0%
to | 344 | 2.8%
6 | 343 | 2.8%
7 | 336 | 2.8%
(blank) | 321 | 2.7%
8 | 303 | 2.5%
Other values (1435) | 8128 | 67.2%

Most occurring characters

ValueCountFrequency (%)
6492
19.0%
1 5632
16.5%
2 3654
10.7%
0 3510
10.3%
3 2013
 
5.9%
4 1671
 
4.9%
5 1388
 
4.1%
6 1153
 
3.4%
& 1079
 
3.2%
7 972
 
2.9%
Other values (56) 6521
19.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21573
63.3%
Space Separator 6492
 
19.0%
Lowercase Letter 2346
 
6.9%
Uppercase Letter 2063
 
6.1%
Other Punctuation 1245
 
3.7%
Dash Punctuation 331
 
1.0%
Open Punctuation 17
 
< 0.1%
Close Punctuation 17
 
< 0.1%
Control 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 603
29.2%
B 496
24.0%
C 202
 
9.8%
F 144
 
7.0%
D 109
 
5.3%
E 108
 
5.2%
L 66
 
3.2%
G 59
 
2.9%
H 54
 
2.6%
K 42
 
2.0%
Other values (16) 180
 
8.7%
Lowercase Letter
ValueCountFrequency (%)
o 667
28.4%
t 484
20.6%
r 201
 
8.6%
l 183
 
7.8%
e 133
 
5.7%
n 86
 
3.7%
s 85
 
3.6%
a 72
 
3.1%
f 70
 
3.0%
i 68
 
2.9%
Other values (11) 297
12.7%
Decimal Number
ValueCountFrequency (%)
1 5632
26.1%
2 3654
16.9%
0 3510
16.3%
3 2013
 
9.3%
4 1671
 
7.7%
5 1388
 
6.4%
6 1153
 
5.3%
7 972
 
4.5%
8 873
 
4.0%
9 707
 
3.3%
Other Punctuation
ValueCountFrequency (%)
& 1079
86.7%
, 159
 
12.8%
/ 6
 
0.5%
# 1
 
0.1%
Space Separator
ValueCountFrequency (%)
6492
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 331
100.0%
Open Punctuation
ValueCountFrequency (%)
( 17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29676
87.1%
Latin 4409
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 667
15.1%
A 603
13.7%
B 496
11.2%
t 484
11.0%
C 202
 
4.6%
r 201
 
4.6%
l 183
 
4.2%
F 144
 
3.3%
e 133
 
3.0%
D 109
 
2.5%
Other values (37) 1187
26.9%
Common
ValueCountFrequency (%)
6492
21.9%
1 5632
19.0%
2 3654
12.3%
0 3510
11.8%
3 2013
 
6.8%
4 1671
 
5.6%
5 1388
 
4.7%
6 1153
 
3.9%
& 1079
 
3.6%
7 972
 
3.3%
Other values (9) 2112
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34085
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6492
19.0%
1 5632
16.5%
2 3654
10.7%
0 3510
10.3%
3 2013
 
5.9%
4 1671
 
4.9%
5 1388
 
4.1%
6 1153
 
3.4%
& 1079
 
3.2%
7 972
 
2.9%
Other values (56) 6521
19.1%

Modified
DateTime

Distinct: 251
Distinct (%): 1.7%
Missing: 55
Missing (%): 0.4%
Memory size: 118.6 KiB
Minimum: 2012-10-22 00:00:00+00:00
Maximum: 2022-09-30 00:00:00+00:00

[Figure: Histogram with fixed size bins (bins=50)]

CHArea
Text

Distinct: 56
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Length

Max length: 27
Median length: 23
Mean length: 16.369872
Min length: 7

Characters and Unicode

Total characters: 248200
Distinct characters: 44
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: East Credit NHD
2nd row: East Credit NHD
3rd row: Streetsville NHD
4th row: Streetsville NHD
5th row: Streetsville NHD

Value | Count | Frequency (%)
ea | 7932 | 19.0%
northeast | 4466 | 10.7%
west | 4423 | 10.6%
nhd | 2822 | 6.8%
park | 1807 | 4.3%
east | 1776 | 4.3%
cc | 1640 | 3.9%
business | 1605 | 3.8%
gateway | 1355 | 3.2%
dt | 1292 | 3.1%
Other values (45) | 12637 | 30.3%

Most occurring characters

ValueCountFrequency (%)
26593
 
10.7%
e 22314
 
9.0%
t 20985
 
8.5%
s 18670
 
7.5%
a 16564
 
6.7%
r 12957
 
5.2%
o 11634
 
4.7%
E 10593
 
4.3%
i 9082
 
3.7%
A 9071
 
3.7%
Other values (34) 89737
36.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 148919
60.0%
Uppercase Letter 60206
24.3%
Space Separator 26593
 
10.7%
Close Punctuation 5919
 
2.4%
Open Punctuation 5919
 
2.4%
Dash Punctuation 644
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 22314
15.0%
t 20985
14.1%
s 18670
12.5%
a 16564
11.1%
r 12957
8.7%
o 11634
7.8%
i 9082
6.1%
l 6304
 
4.2%
h 5359
 
3.6%
n 5114
 
3.4%
Other values (12) 19936
13.4%
Uppercase Letter
ValueCountFrequency (%)
E 10593
17.6%
A 9071
15.1%
N 8626
14.3%
C 7054
11.7%
W 5178
8.6%
D 5093
8.5%
H 3094
 
5.1%
M 2781
 
4.6%
P 2301
 
3.8%
B 1605
 
2.7%
Other values (8) 4810
8.0%
Space Separator
ValueCountFrequency (%)
26593
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5919
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5919
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 644
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 209125
84.3%
Common 39075
 
15.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 22314
 
10.7%
t 20985
 
10.0%
s 18670
 
8.9%
a 16564
 
7.9%
r 12957
 
6.2%
o 11634
 
5.6%
E 10593
 
5.1%
i 9082
 
4.3%
A 9071
 
4.3%
N 8626
 
4.1%
Other values (30) 68629
32.8%
Common
ValueCountFrequency (%)
26593
68.1%
) 5919
 
15.1%
( 5919
 
15.1%
- 644
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 248200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26593
 
10.7%
e 22314
 
9.0%
t 20985
 
8.5%
s 18670
 
7.5%
a 16564
 
6.7%
r 12957
 
5.2%
o 11634
 
4.7%
E 10593
 
4.3%
i 9082
 
3.7%
A 9071
 
3.7%
Other values (34) 89737
36.2%

Ward
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3456668
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 5
Q3: 7
95-th percentile: 11
Maximum: 11
Range: 10
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.4899953
Coefficient of variation (CV): 0.46579695
Kurtosis: 0.022308613
Mean: 5.3456668
Median Absolute Deviation (MAD): 1
Skewness: 0.37190311
Sum: 81051
Variance: 6.2000765
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]

Common values

Value | Count | Frequency (%)
5 | 6546 | 43.2%
1 | 1370 | 9.0%
8 | 1103 | 7.3%
7 | 1034 | 6.8%
3 | 1012 | 6.7%
4 | 903 | 6.0%
9 | 894 | 5.9%
11 | 851 | 5.6%
6 | 686 | 4.5%
2 | 614 | 4.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1370 | 9.0%
2 | 614 | 4.0%
3 | 1012 | 6.7%
4 | 903 | 6.0%
5 | 6546 | 43.2%
6 | 686 | 4.5%
7 | 1034 | 6.8%
8 | 1103 | 7.3%
9 | 894 | 5.9%
10 | 149 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
11 | 851 | 5.6%
10 | 149 | 1.0%
9 | 894 | 5.9%
8 | 1103 | 7.3%
7 | 1034 | 6.8%
6 | 686 | 4.5%
5 | 6546 | 43.2%
4 | 903 | 6.0%
3 | 1012 | 6.7%
2 | 614 | 4.0%
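
Because Ward has only 11 distinct values, its summary statistics can be rebuilt from the frequency tables alone. The sketch below recomputes the observation count, sum, mean, variance and standard deviation from the (value, count) pairs listed above. It assumes the reported variance is the sample variance (divisor n - 1); the variable names are illustrative.

```python
import math

# (ward, count) pairs copied from the Ward frequency tables in this report.
ward_counts = {1: 1370, 2: 614, 3: 1012, 4: 903, 5: 6546, 6: 686,
               7: 1034, 8: 1103, 9: 894, 10: 149, 11: 851}

n = sum(ward_counts.values())                        # number of observations
total = sum(v * c for v, c in ward_counts.items())   # the reported "Sum"
mean = total / n

# Sample variance (divisor n - 1); assumed to match the report's convention.
squared_dev = sum(c * (v - mean) ** 2 for v, c in ward_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 7), round(variance, 7), round(std, 7))
```

With these counts the recomputed values agree with the reported Mean, Sum, Variance and Standard deviation to the precision shown in the report.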

BIA
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 118.6 KiB

Value | Count
(blank) | 13660
CK | 476
MLT | 370
PC | 345
STR | 207

Length

Max length: 3
Median length: 1
Mean length: 1.1439784
Min length: 1

Characters and Unicode

Total characters: 17345
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 13660 | 90.1%
CK | 476 | 3.1%
MLT | 370 | 2.4%
PC | 345 | 2.3%
STR | 207 | 1.4%
CLV | 104 | 0.7%
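
The Alerts section flags BIA as highly imbalanced (73.6%). One common way to score imbalance is one minus the normalized Shannon entropy of the class frequencies. The sketch below applies that formula to the counts above. Whether the profiling tool uses exactly this definition is an assumption here, although for these counts the result comes out at about 0.736. The function name `imbalance` is illustrative.

```python
import math

# Counts copied from the BIA "Common Values" table in this report.
counts = [13660, 476, 370, 345, 207, 104]


def imbalance(class_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class."""
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(class_counts))


score = imbalance(counts)
print(f"{score:.3f}")
```

A perfectly even split across the six categories would score 0 under this definition.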

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
ck | 476 | 31.7%
mlt | 370 | 24.6%
pc | 345 | 23.0%
str | 207 | 13.8%
clv | 104 | 6.9%

Most occurring characters

ValueCountFrequency (%)
13660
78.8%
C 925
 
5.3%
T 577
 
3.3%
K 476
 
2.7%
L 474
 
2.7%
M 370
 
2.1%
P 345
 
2.0%
S 207
 
1.2%
R 207
 
1.2%
V 104
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Space Separator 13660
78.8%
Uppercase Letter 3685
 
21.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 925
25.1%
T 577
15.7%
K 476
12.9%
L 474
12.9%
M 370
 
10.0%
P 345
 
9.4%
S 207
 
5.6%
R 207
 
5.6%
V 104
 
2.8%
Space Separator
ValueCountFrequency (%)
13660
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13660
78.8%
Latin 3685
 
21.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 925
25.1%
T 577
15.7%
K 476
12.9%
L 474
12.9%
M 370
 
10.0%
P 345
 
9.4%
S 207
 
5.6%
R 207
 
5.6%
V 104
 
2.8%
Common
ValueCountFrequency (%)
13660
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17345
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13660
78.8%
C 925
 
5.3%
T 577
 
3.3%
K 476
 
2.7%
L 474
 
2.7%
M 370
 
2.1%
P 345
 
2.0%
S 207
 
1.2%
R 207
 
1.2%
V 104
 
0.6%

PIN_1
Real number (ℝ)

Distinct: 4448
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11691579
Minimum: 37200
Maximum: 33221400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 118.6 KiB

Quantile statistics

Minimum: 37200
5-th percentile: 1877200
Q1: 5164700
Median: 10176900
Q3: 14792100
95-th percentile: 31141506
Maximum: 33221400
Range: 33184200
Interquartile range (IQR): 9627400

Descriptive statistics

Standard deviation: 8290957.6
Coefficient of variation (CV): 0.70913927
Kurtosis: 0.41033611
Mean: 11691579
Median Absolute Deviation (MAD): 4717950
Skewness: 1.043898
Sum: 1.7726772 × 10^11
Variance: 6.8739979 × 10^13
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
6068300 | 204 | 1.3%
31141506 | 118 | 0.8%
9663800 | 114 | 0.8%
4407700 | 105 | 0.7%
12876900 | 82 | 0.5%
14804200 | 65 | 0.4%
33071400 | 58 | 0.4%
24265600 | 54 | 0.4%
17704200 | 50 | 0.3%
6076200 | 47 | 0.3%
Other values (4438) | 14265 | 94.1%

Minimum 10 values

Value | Count | Frequency (%)
37200 | 4 | < 0.1%
37400 | 13 | 0.1%
38300 | 3 | < 0.1%
38400 | 6 | < 0.1%
38500 | 1 | < 0.1%
38600 | 5 | < 0.1%
38800 | 1 | < 0.1%
38900 | 2 | < 0.1%
39300 | 1 | < 0.1%
39800 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
33221400 | 1 | < 0.1%
33216500 | 2 | < 0.1%
33196500 | 2 | < 0.1%
33196400 | 1 | < 0.1%
33191500 | 27 | 0.2%
33176900 | 1 | < 0.1%
33176700 | 1 | < 0.1%
33171600 | 9 | 0.1%
33171400 | 1 | < 0.1%
33162200 | 1 | < 0.1%

Interactions

[Figure: interaction plots between pairs of variables (64 panels)]

Correlations

[Figure: correlation heatmap]

Variable | X | Y | FID | ID | NAICSCode | StreetNo | Ward | PIN_1 | EmplRange | NAICSTitle | BIA
X | 1.000 | 0.045 | -0.030 | -0.040 | 0.034 | -0.249 | -0.703 | -0.128 | 0.050 | 0.091 | 0.348
Y | 0.045 | 1.000 | -0.018 | -0.010 | -0.227 | 0.312 | -0.167 | 0.054 | 0.047 | 0.158 | 0.433
FID | -0.030 | -0.018 | 1.000 | 0.981 | 0.047 | 0.022 | 0.037 | 0.106 | 0.045 | 0.074 | 0.154
ID | -0.040 | -0.010 | 0.981 | 1.000 | 0.043 | 0.032 | 0.042 | 0.113 | 0.050 | 0.082 | 0.083
NAICSCode | 0.034 | -0.227 | 0.047 | 0.043 | 1.000 | -0.101 | 0.033 | 0.001 | 0.079 | 0.908 | 0.069
StreetNo | -0.249 | 0.312 | 0.022 | 0.032 | -0.101 | 1.000 | 0.229 | 0.078 | 0.046 | 0.077 | 0.286
Ward | -0.703 | -0.167 | 0.037 | 0.042 | 0.033 | 0.229 | 1.000 | 0.116 | 0.058 | 0.149 | 0.425
PIN_1 | -0.128 | 0.054 | 0.106 | 0.113 | 0.001 | 0.078 | 0.116 | 1.000 | 0.060 | 0.092 | 0.276
EmplRange | 0.050 | 0.047 | 0.045 | 0.050 | 0.079 | 0.046 | 0.058 | 0.060 | 1.000 | 0.115 | 0.068
NAICSTitle | 0.091 | 0.158 | 0.074 | 0.082 | 0.908 | 0.077 | 0.149 | 0.092 | 0.115 | 1.000 | 0.094
BIA | 0.348 | 0.433 | 0.154 | 0.083 | 0.069 | 0.286 | 0.425 | 0.276 | 0.068 | 0.094 | 1.000
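
The "High correlation" alerts in the overview can be related back to this matrix by filtering on absolute correlation. The sketch below does that for a handful of entries copied from the matrix. The cut-off `THRESHOLD = 0.5` is an assumed value chosen for illustration, not a setting confirmed by this report, and only a subset of pairs is included to keep the example short.

```python
# A few off-diagonal entries copied from the correlation matrix in this report.
corr = {
    ("X", "Ward"): -0.703,
    ("FID", "ID"): 0.981,
    ("NAICSCode", "NAICSTitle"): 0.908,
    ("Y", "BIA"): 0.433,
    ("Ward", "BIA"): 0.425,
    ("X", "Y"): 0.045,
}

THRESHOLD = 0.5  # assumed cut-off, for illustration only

# Keep pairs whose absolute correlation reaches the threshold.
high = sorted(pair for pair, r in corr.items() if abs(r) >= THRESHOLD)

for a, b in high:
    print(f"{a} is highly correlated with {b} (r = {corr[(a, b)]:+.3f})")
```

With this cut-off the flagged pairs are X/Ward, FID/ID and NAICSCode/NAICSTitle, the same pairs named in the Alerts section. The next largest absolute value in the full matrix is 0.433 (Y and BIA).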

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: Nullity correlation heatmap.] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
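
Nullity correlation can be illustrated by correlating per-column missingness indicators. The sketch below builds a 0/1 "is missing" vector for two columns and computes their Pearson correlation with a small hand-written helper. The column names are taken from this dataset, but the six rows are hypothetical toy data, and the helper `pearson` is an illustrative stand-in rather than the routine the report itself uses.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy rows; None marks a missing cell.
rows = [
    {"EmplRange": "1 to 4", "Modified": "2022-06-29"},
    {"EmplRange": None, "Modified": None},
    {"EmplRange": "5 to 9", "Modified": "2019-09-19"},
    {"EmplRange": None, "Modified": "2021-05-21"},
    {"EmplRange": "1 to 4", "Modified": None},
    {"EmplRange": None, "Modified": None},
]

# 1 where the cell is missing, 0 where it is present.
is_null = {col: [int(row[col] is None) for row in rows]
           for col in ("EmplRange", "Modified")}

nullity_corr = pearson(is_null["EmplRange"], is_null["Modified"])
print(round(nullity_corr, 3))
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one is present when the other is missing, and a value near 0 that their missingness is unrelated.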

Sample

Row | X | Y | FID | ID | Name | EmplRange | NAICSCode | NAICSTitle | NAICSDescr | Phone | Fax | TollFree | EMail | WebAddress | StreetNo | StreetName | PostalCode | BuildingNo | UnitNo | Modified | CHArea | Ward | BIA | PIN_1
0 | -79.706698 | 43.603321 | 1 | 2791 | Thai Rice | 1 to 4 | 722512 | Accommodation | Limited-service eating places | 905-821-8999 | | | | www.thairice.ca | 1201 | Britannia Rd W | L5V 1N2 | | 2 | 2019/09/19 00:00:00+00 | East Credit NHD | 11 | | 4431600
1 | -79.706698 | 43.603321 | 2 | 2792 | Vintners Cellar Britannia Inc. | 1 to 4 | 312130 | Manufacturing | Wineries | 905-813-3436 | | | vintnersgta@gmail.com | www.vintnerscellargta.com | 1201 | Britannia Rd W | L5V 1N2 | | 16 | 2021/05/21 00:00:00+00 | East Credit NHD | 11 | | 4431600
2 | -79.722442 | 43.589190 | 3 | 2794 | UK Insurance Brokers Inc | 1 to 4 | 524210 | Finance | Insurance Agencies and Brokerages | 905-567-7003 | 905-567-6003 | | Info@ukinsurance.ca | www.ukinsurance.ca | 1965 | Britannia Rd W | L5M 4Y4 | | 200 | 2022/06/29 00:00:00+00 | Streetsville NHD | 11 | | 13210900
3 | -79.722442 | 43.589190 | 4 | 2795 | Prestige Pools & Leisure Products Ltd. | 5 to 9 | 453999 | Retail | All Other Miscellaneous Store Retailers (except Beer and Wine-Making Supplies Stores) | 905-542-1505 | | | info@prestigepools.ca | www.prestigepools.ca | 1965 | Britannia Rd W | L5M 4Y4 | | 5 | 2022/06/29 00:00:00+00 | Streetsville NHD | 11 | | 13210900
4 | -79.722442 | 43.589190 | 5 | 2798 | Grocery Cafe | 1 to 4 | 445110 | Retail | Supermarkets and Other Grocery (except Convenience) Stores | 905-821-8960 | | | TheNextVentureInc@outlook.com | | 1965 | Britannia Rd W | L5M 4Y4 | | 4 | 2022/06/29 00:00:00+00 | Streetsville NHD | 11 | | 13210900
5 | -79.722442 | 43.589190 | 6 | 2800 | Carolyn's Model and Talent Agency Ltd | 1 to 4 | 711411 | Arts | Agents and Managers for Artists, Athletes, Entertainers and Other Public Figures | 905-542-8885 | 905-542-8887 | | info@carolynsonline.com | www.carolynsonline.com | 1965 | Britannia Rd W | L5M 4Y4 | | 210 | 2019/09/23 00:00:00+00 | Streetsville NHD | 11 | | 13210900
6 | -79.722442 | 43.589190 | 7 | 2801 | Jaz Beauty Salon | 1 to 4 | 812115 | Other Services | Beauty Salons | 905-542-1716 | | | jazbeauty664@gmail.com | | 1965 | Britannia Rd W | L5M 4Y4 | | 8 | 2022/06/29 00:00:00+00 | Streetsville NHD | 11 | | 13210900
7 | -79.722442 | 43.589190 | 8 | 2802 | Great Travel Fares | 1 to 4 | 561510 | Administrative | Travel Agencies | 905-270-4111 | | 1-866-483-8222 | travel@greattravelfares.com | www.greattravelfares.com | 1965 | Britannia Rd W | L5M 4Y4 | | 212 | 2019/09/19 00:00:00+00 | Streetsville NHD | 11 | | 13210900
8 | -79.727679 | 43.582976 | 9 | 2805 | Dr.Nivine Ghobrial Dentistry Professional Corporation | 20 to 49 | 621210 | Health Care | Offices of Dentists | 905-821-8632 | 905-821-4515 | | britanniadental@rogers.com | | 2258 | Britannia Rd W | L5M 2G8 | | | 2022/07/07 00:00:00+00 | Streetsville NHD | 11 | | 15765600
9 | -79.733261 | 43.577724 | 10 | 2808 | Britannia Esso | 5 to 9 | 447110 | Retail | Gasoline Stations with Convenience Stores | 905-821-4079 | | | | | 2520 | Britannia Rd W | L5M 5X7 | | | 2022/07/06 00:00:00+00 | Central Erin Mills NHD | 9 | | 5101000
X  Y  FID  ID  Name  EmplRange  NAICSCode  NAICSTitle  NAICSDescr  Phone  Fax  TollFree  EMail  WebAddress  StreetNo  StreetName  PostalCode  BuildingNo  UnitNo  Modified  CHArea  Ward  BIA  PIN_1
15152-79.65584643.6461601515395026Can-carib Trading corp.1 to 4419120WholesaleWholesale Trade Agents and Brokers905-696-0082Inquiries@cancaribitrading.com1060Britannia Rd EL4W 4T12022/05/13 00:00:00+00Northeast EA (West)514791700
15153-79.64149943.5995981515495027Bao Bar10 to 19722512AccommodationLimited-service eating places905-803-8222baosandwichbar@gmail.comwww.baosandwichbar.com4310Hurontario StL4Z 3X752022/08/03 00:00:00+00DT Core43602200
15154-79.58049243.5560421515595028Palm Bites1 to 4445299RetailAll Other Specialty Food Stores905-565-5898www.palmbites.ca167Lakeshore Rd EL5G 4T92022/07/07 00:00:00+00Port Credit CN1PC22355800
15155-79.61565043.5815901515695031Magikal Spiritual Supplies1 to 4453999RetailAll Other Miscellaneous Store Retailers (except Beer and Wine-Making Supplies Stores)647-784-7884magikaldoor@gmail.comwww.metexxqueen.com39Dundas St EL5A 1V91052022/08/22 00:00:00+00DT Cooksville7CK2937000
15156-79.56915743.5931241515795032Tek Klinik1 to 4811210Other ServicesElectronic and Precision Equipment Repair and Maintenance905-278-7189www.tekklinik.com1250South Service RdL5E 1V4802022/06/01 00:00:00+00Lakeview NHD112876900
15157-79.70533143.5196661515895033Elesa1 to 4326198ManufacturingAll Other Plastic Product Manufacturing905-916-1101289-859-7587info@elesacanada.comwww.Elesa.com3600Laird RdL5L 6A9B32022/05/11 00:00:00+00Western Business Park EA824811600
15158-79.56784943.6228271515995034Enza Home1 to 4442110RetailFurniture Stores905-277-4777www.enzahome.com2050Dundas St EL4X 1L922022/05/24 00:00:00+00Dixie EA112573700
15159-79.59335143.5998721516095035Panda Hobby1 to 4451120RetailHobby, Toy and Game Stores1-855-726-3280www.pandahobby.ca966Dundas St EL4Y 4H552022/06/09 00:00:00+00Dixie EA112568800
15160-79.63621343.6285431516195038City Sign & Printing1 to 4323119ManufacturingOther Printing416-837-7050mohd_sk@hotmail.com5151Everest DrL4W 2Z35 & 62022/06/03 00:00:00+00Northeast EA (West)55113500
15161-79.61380243.6326201516295041Cellar Nation1 to 4811210Other ServicesElectronic and Precision Equipment Repair and Maintenance647-388-7667www.cellarnation.ca1550South Gateway RdL4W 5G62022/08/09 00:00:00+00Northeast EA (West)314804200